Word embeddings improve generalization over lexical features by placing each word in a lower-dimensional space, using distributional information obtained from unlabeled data. However, the effectiveness of word embeddings for downstream NLP tasks is limited by out-of-vocabulary (OOV) words, for which embeddings do not exist. In this paper, we present MIMICK, an approach to generating OOV word embeddings compositionally, by learning a function from spellings to distributional embeddings. Unlike prior work, MIMICK does not require re-training on the original word embedding corpus; instead, learning is performed at the type level. Intrinsic and extrinsic evaluations demonstrate the power of this simple approach. On 23 languages, MIMICK improves performance over a word-based baseline for tagging part-of-speech and morphosyntactic attributes. It is competitive with (and complementary to) a supervised character-based model in low-resource settings.
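To make the type-level setup concrete, here is a minimal sketch of the idea in PyTorch: a character bi-LSTM is trained to reproduce each in-vocabulary word's pre-trained embedding from its spelling, and the same network then generates vectors for OOV words. All class names, hyperparameters, and the use of mean squared error (in place of the paper's squared Euclidean distance) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CharToVec(nn.Module):
    """Hypothetical sketch: map a word's character sequence to an embedding."""
    def __init__(self, n_chars, char_dim=20, hidden_dim=50, emb_dim=100):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.lstm = nn.LSTM(char_dim, hidden_dim,
                            bidirectional=True, batch_first=True)
        # Project the concatenated forward/backward states to embedding size.
        self.proj = nn.Linear(2 * hidden_dim, emb_dim)

    def forward(self, char_ids):  # char_ids: (batch, word_len)
        states, _ = self.lstm(self.char_emb(char_ids))
        h = states.size(2) // 2
        fwd = states[:, -1, :h]   # final state of the forward direction
        bwd = states[:, 0, h:]    # final state of the backward direction
        return self.proj(torch.cat([fwd, bwd], dim=-1))

# Type-level training: one (spelling, target vector) pair per vocabulary
# word, so the original embedding corpus is never revisited.
model = CharToVec(n_chars=128)
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.MSELoss()  # stand-in for a squared-distance objective

def train_step(char_ids, target_vec):
    opt.zero_grad()
    loss = loss_fn(model(char_ids), target_vec)
    loss.backward()
    opt.step()
    return loss.item()
```

At test time, an OOV word's embedding is simply `model(char_ids)` on its spelling; training cost scales with vocabulary size rather than corpus size, which is what makes the type-level formulation cheap.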